Modelling and prediction of weld bead geometry is an important issue in the
robotic GMAW process. The process is a highly non-linear, coupled
multivariable system, and the relationship between the process parameters and
the weld bead geometry cannot be defined by an explicit mathematical
expression. Supervised learning algorithms are therefore useful for this
purpose. The support vector machine is a very successful approach to
supervised learning. Within this approach, higher accuracy and generalization
capability can be obtained by using the multiple kernel learning framework,
which is a great advantage given the high prediction accuracy required for
weld bead geometry. In this paper, a novel approach for modelling and
prediction of the weld bead geometry, based on multiple kernel support vector
regression analysis, is proposed, which benefits from a high degree of
accuracy and generalization capability. The model can be used for proper
selection of welding parameters in order to obtain a desired weld bead
geometry in the robotic GMAW process.

1. INTRODUCTION

In today's industrial world, increased competition in the global economy has led to an increasing need
for quick delivery of customized products by manufacturers to meet customer demands [1]. To this end, the era
of rapid prototyping emerged in 1988 and developed quickly, driven by rapid advances in digital computing
systems and advanced technologies such as lasers. This technology overcame the shortcomings of
traditional prototyping methods and remarkably reduced prototype production time [2]. In rapid
prototyping, the constituent parts are usually constructed or assembled by additive manufacturing, in
which a three-dimensional object is created by adding material layer upon layer under computer control.

In recent years, many studies have addressed rapid prototyping of metallic parts based on gas metal
arc welding (GMAW) [3]. In this process, an electric arc is generated between a consumable wire
electrode and the metallic workpiece; the arc heats the metals and causes them to melt and join. In some
GMAW processes a robotic system controls the deposition of the weld material, the position of the welding
torch and other parameters such as the slope and rotation of the torch. A schematic view of the robotic
GMAW process is depicted in Figure 1 [4].

In welding processes, the weld bead geometry, namely the bead width and bead height shown in
Figure 2, is one of the important characteristics of the weld line. In the GMAW process in particular, the
bead geometry has a significant effect on the layer thickness, surface quality and dimensional precision. The
most important parameters affecting the bead geometry are the welding current and voltage, the type and
percentage of inert gas, and the nozzle-to-workpiece distance.

Since an appropriate weld bead geometry results in high weld quality, proper selection of the welding
parameters to obtain a desired weld geometry receives great attention in the GMAW process. It is
therefore important to develop a global model of the weld geometry as a function of the process parameters.
Because of the highly non-linear and coupled multivariable effect of the input parameters on the weld
geometry, this model cannot be defined through an explicit mathematical expression, and advanced
modelling techniques are investigated for this purpose.

Figure 1. Schematic view of the robotic GMAW process [4]

Figure 2. Weld bead geometry


Machine learning is a subfield of computer science that studies the construction of algorithms
capable of learning from, and making predictions based on, a limited set of observed data. Such algorithms
build a model from example inputs in order to make data-driven predictions or decisions. Supervised
learning is the machine learning task of inferring a function from a set of labeled training data [5].
Supervised algorithms can thus establish a model from a limited set of observations and make predictions
for cases that have not been observed. They can therefore be used for modelling and prediction of the weld
geometry in the GMAW process, and several studies have been performed in this field, mainly based on
neural networks and fuzzy systems [6]. For the robotic GMAW process, a database of process parameters
and the corresponding weld geometry has been provided in [7], where predictive modelling was performed
by both neural network and second-order regression analysis; the neural network approach was shown to
be more accurate than the second-order regression.

The support vector machine (SVM) is a state-of-the-art approach to supervised learning, used for
classification and regression analysis, which has proven to be a powerful method in many practical
applications [8]. The main advantage of SVMs over neural networks is that they minimize the structural
risk alongside the empirical risk, which results in better generalization capability in many
problems [9]. The accuracy of SVM-based modelling can be enhanced by a newer approach known as
multiple kernel learning [10], which is introduced in Section 3. Given the high accuracy required in
predicting the weld bead geometry in the robotic GMAW process, this paper applies the multiple kernel
support vector machine to this prediction and shows that it provides higher accuracy and generalization
capability.


2. SUPPORT VECTOR MACHINE 

Support vector machines (SVMs) are supervised learning models with associated learning algorithms
that analyze data and recognize patterns, used for classification and regression analysis [11]. A linear
SVM-based classifier finds the hyperplane that gives the maximum margin between the samples of the two
classes in the training dataset while minimizing the classification error. Such a classifier is
described by Equation (1), in which $x$ is the input vector, $w$ is the weight vector and $b$ is the bias
[12]. The optimum values of $w$ and $b$ are obtained by minimizing the risk function $R(w)$ in
Equation (2), subject to the constraints of Equation (3), for the $N$ samples $(x_i, y_i)$ of the
training dataset [13].


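$$f(x) = \operatorname{sign}(w^{T} x + b) \tag{1}$$

$$R(w) = \frac{1}{2}\|w\|^{2} + C \sum_{i=1}^{N} \xi_i \tag{2}$$

$$y_i (w^{T} x_i + b) \ge 1 - \xi_i, \quad i = 1, \ldots, N \tag{3}$$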


In the risk function of Equation (2), the first term represents the structural risk, i.e. the margin
between the two classes, and the second term represents the empirical risk, i.e. the training error. The
parameter $C$ is the regularization factor, which trades off the relative importance of minimizing the
structural and empirical risks. Figure 3 shows the SVM-based classification.

Figure 3. SVM-based classification
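As an illustration of Equations (2) and (3), the following Python sketch evaluates the risk of a given linear classifier on a toy dataset. The function name `soft_margin_risk`, the toy samples and the choice of taking each slack as the smallest non-negative value that satisfies constraint (3) are assumptions made for this example; they are not taken from the original formulation.

```python
def soft_margin_risk(w, b, samples, C):
    """Evaluate R(w) of Equation (2) for a linear classifier.

    samples is a list of (x, y) pairs with y in {-1, +1}.
    Assumption: each slack xi_i is the smallest non-negative value
    satisfying constraint (3), i.e. max(0, 1 - y_i (w.x_i + b)).
    """
    def dot(u, v):
        return sum(ui * vi for ui, vi in zip(u, v))

    structural = 0.5 * dot(w, w)  # first term: margin (structural risk)
    slacks = [max(0.0, 1.0 - y * (dot(w, x) + b)) for x, y in samples]
    empirical = C * sum(slacks)   # second term: training error
    return structural + empirical, slacks


# Hypothetical toy data: two well separated points and one inside the margin.
toy_samples = [((2.0, 0.0), 1), ((-2.0, 0.0), -1), ((0.5, 0.0), 1)]
risk, slacks = soft_margin_risk((1.0, 0.0), 0.0, toy_samples, C=10.0)
print(risk, slacks)
```

Only the third sample violates the margin, so it alone contributes to the empirical term; increasing `C` makes that violation more expensive relative to the margin term.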


For data with a nonlinear boundary between the two classes, the original feature space can be
mapped to a higher-dimensional feature space in which the training set is separable, through a nonlinear
mapping $\phi$ associated with the kernel function, as depicted in Figure 4 [14].


Figure 4. Mapping of the feature space based on a kernel function

Therefore, SVM-based classification can be expressed as:

$$f(x) = \operatorname{sign}(w^{T} \phi(x) + b) \tag{4}$$

where $w$ and $b$ are obtained by minimizing the risk function $R(w)$ in Equation (2), subject to the
constraints:

$$y_i (w^{T} \phi(x_i) + b) \ge 1 - \xi_i \tag{5}$$


The concept of SVM classification can be generalized to regression by introducing a margin of
tolerance for the function to be estimated, based on the permitted estimation error. Given a limited
number of observations of the function $f(x)$ and a permitted margin of tolerance $\varepsilon$, SVM-based
classification between $f(x) + \varepsilon$ and $f(x) - \varepsilon$ can be regarded as estimating $f(x)$
within the permitted margin of tolerance, as depicted in Figure 5 [15]. In other words, in SVM-based
regression (SVR) the input space is mapped into a high-dimensional feature space via the kernel function
and an optimal linear regression is then performed in this space. The formulation of support vector
machines can therefore be generalized to regression as:


$$y = f(x) = \sum_{i} w_i \phi_i(x) + b = w^{T} \phi(x) + b \tag{6}$$


Figure 5. Generalization of SVM-based classification to SVM-based regression [15] 
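The margin of tolerance can be illustrated with the standard ε-insensitive loss. The text does not write this loss out explicitly, so it is assumed here: deviations smaller than ε cost nothing, and larger deviations are penalized linearly. The function name and the numbers below are hypothetical.

```python
def eps_insensitive_loss(y_true, y_pred, eps):
    """Standard epsilon-insensitive loss (assumed, not stated in the text).

    Predictions inside the tube [y_true - eps, y_true + eps] cost nothing;
    outside the tube the cost grows linearly with the excess deviation.
    """
    return max(0.0, abs(y_true - y_pred) - eps)


# Hypothetical values: a tolerance of 0.1 around a measured value of 1.0.
inside = eps_insensitive_loss(1.0, 1.05, eps=0.1)   # within the tube
outside = eps_insensitive_loss(1.0, 1.30, eps=0.1)  # 0.3 away from the target
print(inside, outside)
```

A smaller ε tightens the tube, so more training samples fall outside it and influence the fitted function.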


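The optimal regression is obtained by maximizing the function $L(\alpha_i, \alpha_i^{*})$ in Equation (7),
subject to the constraints given by Equation (8) [15].

$$L(\alpha_i, \alpha_i^{*}) = -\frac{1}{2} \sum_{i=1}^{N} \sum_{j=1}^{N} (\alpha_i - \alpha_i^{*})(\alpha_j - \alpha_j^{*}) \langle \phi(x_i), \phi(x_j) \rangle - \varepsilon \sum_{i=1}^{N} (\alpha_i + \alpha_i^{*}) + \sum_{i=1}^{N} y_i (\alpha_i - \alpha_i^{*}) \tag{7}$$

$$\sum_{i=1}^{N} (\alpha_i - \alpha_i^{*}) = 0, \qquad \alpha_i, \alpha_i^{*} \in [0, C], \quad i = 1, \ldots, N \tag{8}$$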


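According to Mercer's theorem [14], the inner product $\langle \phi(x_i), \phi(x_j) \rangle$ can be
defined through a kernel function as $K(x_i, x_j) = \langle \phi(x_i), \phi(x_j) \rangle$. Therefore, the
function $L(\alpha_i, \alpha_i^{*})$ in Equation (7) can be expressed as:

$$L(\alpha_i, \alpha_i^{*}) = -\frac{1}{2} \sum_{i=1}^{N} \sum_{j=1}^{N} (\alpha_i - \alpha_i^{*})(\alpha_j - \alpha_j^{*}) K(x_i, x_j) - \varepsilon \sum_{i=1}^{N} (\alpha_i + \alpha_i^{*}) + \sum_{i=1}^{N} y_i (\alpha_i - \alpha_i^{*}) \tag{9}$$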


The optimization problem can be solved by quadratic programming, and the estimated function is
expressed in terms of the optimal values as Equation (10) [16].

$$f(x) = \sum_{i=1}^{N} (\alpha_i - \alpha_i^{*}) K(x, x_i) + b \tag{10}$$
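Equation (10) can be read as a weighted sum of kernel evaluations between the query point and the training samples. The sketch below is a minimal illustration under stated assumptions: the dual coefficients are taken as already computed (no quadratic programming solver is included), a plain inner product is used as the kernel for simplicity, and all names and numbers are hypothetical.

```python
def linear_kernel(x, z):
    """Plain inner product, used here only to keep the example simple."""
    return sum(a * b for a, b in zip(x, z))


def svr_predict(x, training_points, dual_coefs, b, kernel):
    """Evaluate Equation (10): f(x) = sum_i (alpha_i - alpha_i^*) K(x, x_i) + b.

    dual_coefs[i] holds the difference alpha_i - alpha_i^* for sample i.
    Samples whose coefficient is zero do not contribute to the prediction.
    """
    return sum(c * kernel(x, xi) for c, xi in zip(dual_coefs, training_points)) + b


# Hypothetical, hand-picked values (not the result of an actual optimization).
points = [(1.0, 0.0), (0.0, 1.0)]
coefs = [0.5, -0.25]
prediction = svr_predict((2.0, 4.0), points, coefs, b=0.1, kernel=linear_kernel)
print(prediction)
```

Passing the kernel as an argument keeps the prediction routine unchanged when the inner product is swapped for a Gaussian or polynomial kernel.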


The most common formulations for the kernel function are listed in Table 1.

Table 1. Most Common Formulations for the Kernel Function


Kernel type                     Formulation
Gaussian radial basis (RBF)     $K(x, x_i) = \exp\left(-\|x - x_i\|^{2} / (2\sigma^{2})\right)$
Polynomial of degree d          $K(x, x_i) = (\langle x, x_i \rangle + p)^{d}, \quad d \in \mathbb{N}$
Multi-Layer Perceptron (MLP)    $K(x, x_i) = \tanh(k \langle x, x_i \rangle + \theta), \quad k, \theta > 0$
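The three kernel families of Table 1 can be sketched in a few lines of Python. The function names are hypothetical, and the Gaussian kernel is written with the common $2\sigma^{2}$ denominator, which is an assumption about the convention intended in the table.

```python
import math


def inner(x, z):
    """Inner product <x, z> of two equally long vectors."""
    return sum(a * b for a, b in zip(x, z))


def gaussian_kernel(x, z, sigma2):
    """Gaussian RBF kernel; the 2*sigma^2 denominator is an assumed convention."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, z))
    return math.exp(-sq_dist / (2.0 * sigma2))


def polynomial_kernel(x, z, p, d):
    """Polynomial kernel of integer degree d with offset p."""
    return (inner(x, z) + p) ** d


def mlp_kernel(x, z, k, theta):
    """Multi-layer perceptron (sigmoid) kernel with scale k and offset theta."""
    return math.tanh(k * inner(x, z) + theta)


x, z = (1.0, 2.0), (3.0, 4.0)
print(gaussian_kernel(x, x, sigma2=1.0))    # identical points give 1.0
print(polynomial_kernel(x, z, p=1.0, d=2))  # (11 + 1)^2
print(mlp_kernel((0.0, 0.0), z, k=1.0, theta=0.0))
```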


3. SUPPORT VECTOR REGRESSION BASED ON MULTIPLE KERNEL LEARNING 

In SVM-based regression, the performance of the learning algorithm depends strongly on the data
representation, which is chosen through the kernel function. The kernel function measures the nonlinear
similarity between samples, so an efficient kernel should represent the data adaptively. In addition, an
appropriate regularization term is defined for the learning problem in terms of the kernel function's
parameters. In most cases, the parameters of a single kernel function are tuned for the whole dataset.
Although the kernel parameter can be chosen optimally to enhance the generalization capability, learning
with a single kernel is not very data-adaptive or discriminative. Multiple kernel learning (MKL) provides
a more flexible framework than a single kernel and mines the information in the data more adaptively and
effectively [17]. In the MKL framework, the kernel function is formed as a convex linear combination of
M basis kernel functions that satisfy Mercer's conditions:


$$K(x, x_i) = \sum_{m=1}^{M} d_m K_m(x, x_i) \tag{11}$$

where $d_m$ is the weight of the m-th basis kernel function and must satisfy the conditions:

$$\sum_{m=1}^{M} d_m = 1, \qquad d_m \ge 0 \tag{12}$$

The combining weights are collected in a weight vector, namely $d = [d_1, \ldots, d_M]^{T}$.
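A convex combination of basis kernels, as in Equations (11) and (12), can be sketched as follows. The function `combined_kernel`, its tolerance argument and the two basis kernels are hypothetical choices for illustration; the check simply enforces the conditions of Equation (12).

```python
def combined_kernel(x, z, kernels, weights, tol=1e-9):
    """Evaluate K(x, z) = sum_m d_m K_m(x, z) as in Equation (11).

    The weights must be non-negative and sum to one (Equation (12));
    otherwise a ValueError is raised.
    """
    if any(d < 0 for d in weights) or abs(sum(weights) - 1.0) > tol:
        raise ValueError("weights must be non-negative and sum to 1")
    return sum(d * k(x, z) for d, k in zip(weights, kernels))


def linear(x, z):
    return sum(a * b for a, b in zip(x, z))


def quadratic(x, z):
    return (linear(x, z) + 1.0) ** 2


# Hypothetical weights: 25% linear kernel, 75% quadratic kernel.
value = combined_kernel((1.0, 0.0), (2.0, 0.0), [linear, quadratic], [0.25, 0.75])
print(value)
```

Because each basis kernel satisfies Mercer's conditions and the weights are non-negative, the combination is again a valid kernel, which is what allows it to be substituted into the SVR problem.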

The multiple kernel learning (MKL) problem can be described as learning the combining weights
$d_m$ and the solutions of the original problem, for example $\alpha_i$ and $\alpha_i^{*}$ for the SVR
problem in Equation (9), in a single optimization problem. Substituting Equation (11) into Equation (9)
gives the optimization problem of MKL-based SVR: maximization of $L(\alpha_i, \alpha_i^{*})$ in
Equation (13), subject to the constraints of Equation (14) [17].


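$$L(\alpha_i, \alpha_i^{*}) = -\frac{1}{2} \sum_{i=1}^{N} \sum_{j=1}^{N} (\alpha_i - \alpha_i^{*})(\alpha_j - \alpha_j^{*}) \sum_{m=1}^{M} d_m K_m(x_i, x_j) - \varepsilon \sum_{i=1}^{N} (\alpha_i + \alpha_i^{*}) + \sum_{i=1}^{N} y_i (\alpha_i - \alpha_i^{*}) \tag{13}$$

$$\sum_{i=1}^{N} (\alpha_i - \alpha_i^{*}) = 0, \qquad \alpha_i, \alpha_i^{*} \in [0, C], \quad i = 1, \ldots, N, \qquad \sum_{m=1}^{M} d_m = 1, \quad d_m \ge 0, \quad m = 1, \ldots, M \tag{14}$$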

Recently, a simple and efficient algorithm for multiple kernel learning has been proposed, which
solves the optimization problem by the gradient descent method [18]. This approach, known as
Simple MKL, is based on the fact that the objective function $L$ in Equation (13) is convex and
differentiable. Therefore, the optimum weight vector $d$ can be obtained by updating it along the gradient
descent direction of $L$. In this method, the gradient of the objective function is computed from the
derivatives of $L$ as:


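$$\frac{\partial L}{\partial d_m} = -\frac{1}{2} \sum_{i=1}^{N} \sum_{j=1}^{N} (\alpha_i - \alpha_i^{*})(\alpha_j - \alpha_j^{*}) K_m(x_i, x_j) \tag{15}$$

Next, the descent direction $D$ is found from the gradient and $d$ is updated as:

$$d \leftarrow d + \gamma D \tag{16}$$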


where $\gamma$ is the step length. The gradient of the objective function is only updated when the
objective value decreases. This update procedure is repeated until the stopping criterion is met [18].
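The weight update of Equations (15) and (16) can be sketched as below. This is a deliberately simplified projected step, not the reduced-gradient procedure with line search used in [18]: the gradient is computed with the factor $-\frac{1}{2}$ that follows from differentiating Equation (13), the direction is centred so that the weights keep summing to one, and negative weights are clipped and renormalized. All names and numbers are hypothetical.

```python
def mkl_gradient(beta, grams):
    """Gradient of the objective with respect to each kernel weight d_m.

    beta[i] = alpha_i - alpha_i^*; grams[m][i][j] = K_m(x_i, x_j).
    Assumes the -1/2 factor obtained by differentiating Equation (13).
    """
    n = len(beta)
    return [
        -0.5 * sum(beta[i] * beta[j] * K[i][j] for i in range(n) for j in range(n))
        for K in grams
    ]


def update_weights(d, grad, step):
    """One simplified step d <- d + step * D (not the exact rule of [18])."""
    mean_g = sum(grad) / len(grad)
    direction = [-(g - mean_g) for g in grad]  # components sum to zero
    d_new = [max(0.0, dm + step * Dm) for dm, Dm in zip(d, direction)]
    total = sum(d_new)                         # positive: weights summed to 1
    return [dm / total for dm in d_new]


# Hypothetical example with two samples and two basis kernels.
beta = [1.0, -1.0]
grams = [[[1.0, 0.0], [0.0, 1.0]], [[1.0, 0.5], [0.5, 1.0]]]
grad = mkl_gradient(beta, grams)
d = update_weights([0.5, 0.5], grad, step=0.4)
print(grad, d)
```

In this toy case the first kernel has the more negative gradient, so the step shifts weight towards it while the weights remain on the simplex of Equation (12).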
4. RESULTS AND DISCUSSION

To establish the MK-SVR predictive models, the database of measured weld bead geometry and the
corresponding process parameters provided by Xiong et al. [7], shown in Table 2, was used. The first
thirty-one samples of this database were used to train the MK-SVR models, and the predictive accuracy of
the established models was evaluated on the next twelve samples (experiments 32 to 43). To improve the
accuracy, all input and target values were normalized between -1 and +1 as:


$$p_n = \frac{p - \frac{\max + \min}{2}}{\frac{\max - \min}{2}} \tag{17}$$


where max and min are, respectively, the maximum and minimum values of the input or output over the
whole dataset, $p$ is the input or output and $p_n$ is the corresponding normalized value. Based on the
normalized dataset, the single kernel and multiple kernel SVM models were implemented with the SVM-KM [20]
and SimpleMKL MATLAB toolboxes, respectively. After the models were trained and the normalized outputs
were predicted, the outputs were scaled back to their original range as:


$$\hat{y} = y_n \cdot \frac{\max - \min}{2} + \frac{\max + \min}{2} \tag{18}$$


where $\hat{y}$ is the predicted output in the original range and $y_n$ is the normalized predicted
output. For predictive modelling of the bead width, the kernel function was selected as a combination of
401 Gaussian basis kernel functions with parameters varying from 8 to 12 in increments of 0.01. For the
bead height, it was selected as a combination of three polynomial basis functions with parameters 1, 2
and 3 and 81 Gaussian basis kernel functions with parameters varying from 0.2 to 1 in increments of 0.01.
The single kernel models were implemented with the Gaussian kernel function. The parameters of the single
kernel and multiple kernel models are listed in Table 3.
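The data preparation described above can be sketched in Python: scaling to the range between -1 and +1 and back, as in Equations (17) and (18), and building the grids of Gaussian kernel parameters. The function names are hypothetical, and the bead width extremes (6.83 mm and 11.76 mm) are read from Table 2; the original work used MATLAB toolboxes, so this is only an illustrative equivalent.

```python
def normalize(p, lo, hi):
    """Map p from [lo, hi] to [-1, +1] (Equation (17))."""
    return (p - (hi + lo) / 2.0) / ((hi - lo) / 2.0)


def denormalize(p_n, lo, hi):
    """Map a normalized value back to the original range (Equation (18))."""
    return p_n * (hi - lo) / 2.0 + (hi + lo) / 2.0


def parameter_grid(start, stop, step):
    """Evenly spaced kernel parameters from start to stop, both included."""
    count = round((stop - start) / step) + 1
    return [round(start + i * step, 10) for i in range(count)]


# Bead width range taken from Table 2 (minimum 6.83 mm, maximum 11.76 mm).
w_min, w_max = 6.83, 11.76
scaled = normalize(9.02, w_min, w_max)
restored = denormalize(scaled, w_min, w_max)

width_grid = parameter_grid(8.0, 12.0, 0.01)   # Gaussian kernels for bead width
height_grid = parameter_grid(0.2, 1.0, 0.01)   # Gaussian kernels for bead height
print(len(width_grid), len(height_grid))
```

The grid lengths reproduce the counts of 401 and 81 Gaussian basis kernels quoted in the text.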


Table 2. The GMAW Process Parameters and the Corresponding Values of Weld Geometry [7]


Experiment   Wire feed rate   Welding speed   Arc voltage   Nozzle to plate   Bead width   Bead height
No           (m/min)          (cm/min)        (V)           distance (mm)     (mm)         (mm)
1            5.2              22.5            17.5          9                 8.95         2.88
2            3.6              22.5            17.5          9                 10.72        3.35
3            5.2              37.5            17.5          9                 7.19         2.45
4            3.6              37.5            17.5          9                 8.29         2.75
5            5.2              22.5            20.5          9                 10.25        2.66
6            3.6              22.5            20.5          9                 11.5         3.26
7            5.2              37.5            20.5          9                 8.36         2.17
8            3.6              37.5            20.5          9                 9.35         2.58
9            5.2              22.5            17.5          15                8.36         3
10           3.6              22.5            17.5          15                9.52         3.56
11           5.2              37.5            17.5          15                6.83         2.45
12           3.6              37.5            17.5          15                7.98         2.9
13           5.2              22.5            20.5          15                9.92         2.79
14           3.6              22.5            20.5          15                11.12        3.35
15           5.2              37.5            20.5          15                7.91         2.26
16           2.8              37.5            20.5          15                9.25         2.7
17           6                30              19            12                7.39         2.32
18           4.4              30              19            12                9.9          3.28
19           4.4              15              19            12                11.76        3.8
20           4.4              45              19            12                7.54         2.34
21           4.4              30              16            12                8.08         2.94
22           4.4              30              22            12                9.9          2.45
23           4.4              30              19            6                 9.51         2.77
24           4.4              30              19            18                8.58         2.83
25           4.4              30              19            12                8.88         2.75
26           4.4              30              19            12                9.09         2.83
27           4.4              30              19            12                8.92         2.19
28           4.4              30              19            12                8.91         2.75
29           4.4              30              19            12                8.92         2.83
30           4.4              30              19            12                9.02         2.81
31           4.4              30              19            12                8.8          2.8
32           4                21              17            12                8.798        3.346
33           4                27              17            12                8.899        2.961
34           4                30              17            12                7.954        2.854
35           4                36              17            12                7.249        2.662
36           4                39              17            12                7.193        2.533
37           4                27              18.9          12                10.002       3.218
38           5.2              30              18.9          12                9.116        3.047
39           6                27              20.3          12                11.233       3.304
40           5.2              22.5            19            12                10.61        3.411
41           5.2              37.5            19            12                8.494        2.79
42           4.4              37.5            17.5          12                7.788        2.811
43           6                37.5            20.5          12                9.849        2.876


Table 3. The Single Kernel and Multiple Kernel SVM Parameters


Parameter                       Method            Bead width   Bead height
Kernel parameter (σ²)           single kernel     1            0.2
Regularization factor (C)       multiple kernel   7100         100
                                single kernel     10           25000
Insensitivity parameter (ε)     multiple kernel   0.009        ya
                                single kernel     107          107


The accuracy of the final models was evaluated using the root mean square error (RMSE),
normalized root mean square error (NRMSE) and mean absolute percentage error (MAPE) statistical
indices, defined as:


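$$\mathrm{RMSE} = \sqrt{\frac{\sum_{i=1}^{N} (y_i - \hat{y}_i)^{2}}{N}} \tag{19}$$

$$\mathrm{NRMSE} = \frac{\mathrm{RMSE}}{\bar{y}} \tag{20}$$

$$\mathrm{MAPE} = \frac{1}{N} \sum_{i=1}^{N} \frac{|y_i - \hat{y}_i|}{y_i} \times 100\% \tag{21}$$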


In these equations, $y_i$ and $\hat{y}_i$ are the measured and predicted outputs, respectively,
$N$ is the number of training or testing samples and $\bar{y}$ is the mean value of all measured
outputs. The calculated values of the indices are listed in Table 4. Besides outperforming the SK-SVR on
the test data, the MK-SVR achieves a better testing mean absolute percentage error (MAPE) than the
ANN-based approach of [7], for which a MAPE of 2.013% was reported on the test data.
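The three indices can be computed with a few lines of Python. The sketch below follows the usual definitions of RMSE, NRMSE (RMSE divided by the mean of the measured outputs) and MAPE; the function names and the toy numbers are hypothetical and are not taken from the experimental data.

```python
import math


def rmse(y, y_hat):
    """Root mean square error between measured y and predicted y_hat."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(y, y_hat)) / len(y))


def nrmse(y, y_hat, y_mean):
    """RMSE normalized by y_mean, the mean of all measured outputs."""
    return rmse(y, y_hat) / y_mean


def mape(y, y_hat):
    """Mean absolute percentage error, in percent."""
    return 100.0 * sum(abs(a - b) / abs(a) for a, b in zip(y, y_hat)) / len(y)


# Hypothetical measured and predicted values.
measured = [2.0, 4.0]
predicted = [1.0, 5.0]
print(rmse(measured, predicted))
print(nrmse(measured, predicted, y_mean=3.0))
print(mape(measured, predicted))
```

The normalizing mean is passed in explicitly because the text defines it over all measured outputs rather than over the subset being scored.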

The SK-SVR testing root mean square error can be further reduced to 0.1493 by changing the SVM
kernel and model parameters, but this value of the kernel parameter cannot be obtained from the training
database. In other words, the SK-SVR model is overfitted in the training process and cannot be trained to
make the best predictions for the test data as well as the training data. Therefore, the MK-SVR method
offers better generalization capability as well as higher precision on the test data. The measured outputs
and the outputs predicted by the MK-SVM method are depicted in Figure 6, and good agreement can be
observed between them.


Table 4. Calculated Values of the Statistical Indices for the Training and Test Data


Database   Method   RMSE     NRMSE    MAPE
Training   MK-SVM   0.1478   0.0249   0.92%
           SK-SVM   0.0316   0.0053   0.18%
Testing    MK-SVM   0.1507   0.0254   1.77%
           SK-SVM   0.3291   0.0555   5.52%

Figure 6. Predicted and measured values for A) the bead height, B) the bead width


5. CONCLUSION 

In this paper, multiple kernel SVM regression analysis has been proposed for modelling and
prediction of the weld geometry in the robotic GMAW-based rapid manufacturing process, based on the input
parameters of wire feed rate, welding speed, arc voltage and nozzle-to-plate distance. In this analysis,
the kernel function is formed as a linear combination of basis kernel functions, and the Simple MKL
algorithm is used to obtain the optimized combination of kernel functions together with the solutions of
the SVR problem. The results show that single kernel SVR cannot achieve the best prediction results for
both the training and test data at the same time, whereas multiple kernel SVR gives comparably low errors
for both (training and testing RMSE of 0.1478 and 0.1507). The prediction results also indicate that
multiple kernel SVR is more accurate on the test data and generalizes better than the single kernel SVM
and ANN regression approaches. Based on the multiple kernel SVM models, the input parameters can be tuned
to obtain a desired weld geometry in this manufacturing process with a higher degree of accuracy.